ABSTRACT

We consider a powerful iterative inference algorithm which
has recently appeared in the literature, see e.g. [1, 2, 3, 4, 5].
In this paper, we refer to this algorithm as the iterative
"pseudo likelihood" maximization (IPLM) algorithm. We establish
a connection between this algorithm and the problem of Bethe
free energy minimization and prove several results concerning
its fixed points and its convergence properties.

Index Terms: MAP estimation, iterative methods, convergence
of numerical methods.

1. INTRODUCTION 

In this paper, we consider the problem of inferring the value
of an unknown parameter $\Theta$ from an observation vector $Y$,
when $Y$ also depends on a "nuisance" parameter vector
$X = [X_1, X_2, \ldots, X_N]$. In such a scenario, a direct
computation of the maximum a posteriori (MAP) estimate often
turns out to be a complex task. To circumvent this problem,
numerical methods that compute the MAP solution iteratively
have been proposed in the literature. For example, the
expectation-maximization (EM) algorithm [6] and the family of
gradient methods [7] are instances of such algorithms. More
recently, iterative estimation methods based on factor graphs
(FGs) and the belief-propagation (BP) algorithm [8] have
appeared in the literature, see e.g. [1, 2, 3]. Although
slightly different in their implementation, these methods share
the feature of computing a sequence of estimates
$\{\hat{\theta}^{(n)}\}_{n=0}^{\infty}$ by increasing, at each
iteration, a "pseudo" likelihood function (PLF), which is built
by treating standard BP messages as a priori information on the
nuisance parameters. In the remainder of this paper, we will
therefore refer to this kind of algorithm as the iterative
"pseudo likelihood" maximization (IPLM) algorithm.

In [3], the authors proposed to maximize the PLF by means of
the EM algorithm. Considering this particular implementation in
cycle-free FGs, they showed that if only one EM iteration is
performed, one recovers the standard implementation of the EM
algorithm, proving as a by-product that the fixed points of
this particular IPLM algorithm must be stationary points of the
true likelihood function. This conclusion was later shown to be
valid irrespective of the method used to maximize the PLF in
two parallel works [4, 5]: in [4] this result was shown in the
particular context of synchronization problems, whereas general
FGs were considered in [5]. In [5], the author also provides a
result for cyclic FGs, but the proposed characterization does
not enable a simple interpretation of the nature of the fixed
points. In this contribution, we prove several properties of
the IPLM algorithm for general FGs by placing the MAP
estimation problem in the more general framework of Bethe free
energy minimization. In particular, i) we show that the fixed
points of the IPLM algorithm must be stationary points of the
Bethe free energy of the system [9]; ii) we show that any fixed
point of the IPLM algorithm is also a fixed point of the EM
algorithm; iii) we formulate necessary and sufficient
conditions for local convergence of the IPLM algorithm. As a
corollary, we show that the IPLM algorithm never converges to
maxima of the Bethe free energy.

2. BETHE FREE ENERGY MINIMIZATION 

In order to generalize the properties of the IPLM algorithm to
general FGs, we will consider the general problem of Bethe
free energy minimization. Assume that

$$p_{\Theta,Y,X}(\theta, y, x) = \prod_{a} \Psi_{X_{V_a},\Theta}(x_{V_a}, \theta), \qquad (1)$$

where $V_a \subset \{1, 2, \ldots, N\}$ and $X_{V_a}$ is a vector
made up of the elements of $X$ whose index is in $V_a$. Let us
consider the FG associated to (1), where $\Theta$ is not a
variable node but simply a parameter of the factor nodes (i.e.
only $X_1, \ldots, X_N$ are variable nodes in the FG). If the FG
is cycle free, it is well known [9] that

$$\log p_{\Theta,Y}(\theta, y) = -G_{\Theta,B_a(x_{V_a}),B_i(x_i)}\big(\theta, b_a^*(x_{V_a}), b_i^*(x_i)\big),$$

where $G_{\Theta,B_a(x_{V_a}),B_i(x_i)}(\cdot)$ is the Bethe free
energy associated to the FG and $b_a^*(x_{V_a})$ (resp.
$b_i^*(x_i)$) are the beliefs computed by the BP algorithm [8]
at the factor (resp. variable) nodes of the FG, i.e.

$$b_a^*(x_{V_a}) = \gamma_a \, \Psi_{X_{V_a},\Theta}(x_{V_a}, \theta) \prod_{i \in V_a} m_{i \to a}(x_i, \theta), \qquad (2)$$

$$b_i^*(x_i) = \gamma_i \prod_{a \in P_i} m_{a \to i}(x_i, \theta), \qquad (3)$$

where $m_{a \to i}(x_i, \theta)$ and $m_{i \to a}(x_i, \theta)$ are
the messages computed by the BP algorithm, $P_i$ is the set of
factor nodes connected to variable node $i$, and $\gamma_a$,
$\gamma_i$ are normalization factors.
In the cycle-free case, we see that minimizing the Bethe free
energy with respect to $\theta$ is equivalent to computing the
MAP estimate. In the sequel, we will therefore focus on the
following more general problem:

$$\theta^* = \arg\max_{\theta} L_{\Theta}(\theta), \qquad (4)$$

where

$$L_{\Theta}(\theta) = -G_{\Theta,B_a(x_{V_a}),B_i(x_i)}\big(\theta, b_a^*(x_{V_a}), b_i^*(x_i)\big). \qquad (5)$$

Problem (4) has the following interpretation: when the FG
associated to (1) is cycle free, $\theta^*$ is the MAP solution;
otherwise it is an approximation of the MAP solution.
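As a numerical illustration of the cycle-free identity between the log-likelihood and the negative Bethe free energy, the following sketch evaluates both quantities on a small tree-structured FG. The factor tables, the two binary nuisance variables, and all function names are hypothetical choices made for this example only. The beliefs are obtained by brute-force marginalization, which coincides with the BP beliefs on a cycle-free graph.

```python
import numpy as np

def factors(theta):
    """Hypothetical factor tables for the tree a - x1 - b - x2 - c."""
    psi_a = np.array([1.0, np.exp(theta)])               # Psi_a(x1, theta)
    psi_b = np.array([[np.exp(theta), 1.0],
                      [1.0, np.exp(2.0 * theta)]])       # Psi_b(x1, x2, theta)
    psi_c = np.array([2.0, 1.0])                         # Psi_c(x2), theta-free
    return psi_a, psi_b, psi_c

def bethe_and_log_partition(theta):
    psi_a, psi_b, psi_c = factors(theta)
    joint = psi_a[:, None] * psi_b * psi_c[None, :]      # product of all factors
    z = joint.sum()                                      # sum over x1 and x2
    p = joint / z
    b1, b2 = p.sum(axis=1), p.sum(axis=0)                # variable beliefs
    # Factor beliefs are the exact marginals over each factor's variables.
    factor_terms = (np.sum(b1 * np.log(b1 / psi_a))
                    + np.sum(p * np.log(p / psi_b))
                    + np.sum(b2 * np.log(b2 / psi_c)))
    # Each variable node has degree 2, so the weight (d_i - 1) equals 1.
    variable_terms = np.sum(b1 * np.log(b1)) + np.sum(b2 * np.log(b2))
    return factor_terms - variable_terms, np.log(z)

bethe, log_z = bethe_and_log_partition(theta=0.3)
print(bethe, -log_z)  # the two numbers agree up to rounding
```

On a graph with cycles the same Bethe expression can still be evaluated from BP beliefs, but it is then only an approximation of the negative log-likelihood.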

3. THE IPLM ALGORITHM: PROPERTIES 

In this section, we will derive several properties of the IPLM 
algorithm. We proceed in two steps. We first define the 
"pseudo" likelihood function (PLF) and emphasize some of 
its properties. Based on these results, we will then study the 
convergence properties of the IPLM algorithm. 

3.1. PLF: Definition and Properties 

Before giving a definition of the PLF, we need to introduce
some notation and concepts.

Definitions: A region $\mathcal{R}$ of a FG is defined by a set
of factor nodes and the set of all variable nodes that are
connected to them. A covering set $\Omega$ is a set of regions
such that every factor node in the FG is included in one and
only one region of the set. A variable node $i$ of a region
$\mathcal{R}$ is said to be a boundary node if there exists at
least one factor node $a$ such that $a \notin \mathcal{R}$ and
$a \in P_i$.

Notations: $V_{\mathcal{R}}$ (resp. $P_{\mathcal{R}}$) is the set
of indices of the variable nodes (resp. factor nodes) belonging
to region $\mathcal{R}$; $\bar{V}_{\mathcal{R}}$ is the set of
indices of the boundary variable nodes belonging to region
$\mathcal{R}$.

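Let $\Omega$ be a covering set of cycle-free regions. We define
the pseudo likelihood function associated to $\Omega$ as

$$G^{\Omega}_{\Theta,\Theta'}(\theta, \theta') = \sum_{\mathcal{R} \in \Omega} \log \sum_{x_{\mathcal{R}}} \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta'), \qquad (6)$$

where

$$\Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) = \prod_{a \in P_{\mathcal{R}}} \Psi_{X_{V_a},\Theta}(x_{V_a}, \theta), \qquad (7)$$

$$\Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta') = \prod_{i \in \bar{V}_{\mathcal{R}}} \; \prod_{a \in P_i \setminus P_{\mathcal{R}}} m_{a \to i}(x_i, \theta'), \qquad (8)$$

i.e. $\Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta)$ is equal
to the product of the factors belonging to $\mathcal{R}$ and
$\Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta')$ is equal to
the product of the messages entering the boundary variable
nodes of $\mathcal{R}$.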

Let us now prove some properties of the region-based free 
energy which will later prove to be useful in the analysis of 
the convergence of the IPLM algorithm. 

Property 3.1: If $\theta' = \theta$, we have

$$\nabla_{\theta} L_{\Theta}(\theta) = \nabla_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta'). \qquad (9)$$



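Proof: On the one hand, taking into account the expressions of
$b_a^*(x_{V_a})$ and $b_i^*(x_i)$ in (2), (3) and applying
standard differentiation rules, we obtain (see, e.g. [10]):

$$\nabla_{\theta} L_{\Theta}(\theta) = -\nabla_{\theta} G_{\Theta,B_a(x_{V_a}),B_i(x_i)}\big(\theta, b_a^*(x_{V_a}), b_i^*(x_i)\big) = \sum_{a} \sum_{x_{V_a}} b_a^*(x_{V_a}) \, \nabla_{\theta} \log \Psi_{X_{V_a},\Theta}(x_{V_a}, \theta). \qquad (10)$$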

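On the other hand, taking the derivative of (6) with respect to
$\theta$ and using the fact that
$\nabla_{\theta} \log f_{\theta} = \nabla_{\theta} f_{\theta} / f_{\theta}$,
we get

$$\nabla_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta') = \sum_{\mathcal{R} \in \Omega} \frac{\sum_{x_{\mathcal{R}}} \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta') \, \nabla_{\theta} \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta)}{\sum_{x_{\mathcal{R}}} \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta')} \qquad (11)$$

$$= \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \qquad (12)$$

$$= \sum_{\mathcal{R} \in \Omega} \sum_{a \in P_{\mathcal{R}}} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{V_a},\Theta}(x_{V_a}, \theta), \qquad (13)$$

where

$$b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') = \frac{\Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta')}{\sum_{x_{\mathcal{R}}} \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta')}. \qquad (14)$$

Now, since $\Omega$ is a covering set of regions, we have that
$\sum_{\mathcal{R} \in \Omega} \sum_{a \in P_{\mathcal{R}}} = \sum_{a}$.
Moreover, since the regions are cycle free, it can readily be
shown, starting from (14) and using the definition of the BP
message update rules [8], that

$$\sum_{x_{\mathcal{R}} \in \mathcal{X}_{\mathcal{R}}^{x_{V_a}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta) = b_a^*(x_{V_a}), \qquad (15)$$

where $\mathcal{X}_{\mathcal{R}}^{x_{V_a}}$ is the set of possible
values for $X_{\mathcal{R}}$ when $X_{V_a} = x_{V_a}$. Therefore,
plugging (15) into (13) and comparing with (10), we get (9). □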

This first property of the PLF is of particular interest since
it states that the Bethe free energy and the PLF locally have
the same first-order behavior. As we will see in the next
section, this property will prove to be key in the
characterization of the fixed points of the IPLM algorithm.
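Property 3.1 can be checked numerically on a toy cycle-free FG. The sketch below assumes that the PLF is a sum, over the regions of the covering set, of the log of the summed product of region factors and incoming boundary messages. The chain a - x1 - b - x2 - c, its factor tables, the covering set {a, b}, {c}, and all names are hypothetical. Gradients are approximated by central differences.

```python
import numpy as np

def factors(theta):
    """Hypothetical factor tables of the chain a - x1 - b - x2 - c."""
    psi_a = np.array([1.0, np.exp(theta)])
    psi_b = np.array([[np.exp(theta), 1.0],
                      [1.0, np.exp(2.0 * theta)]])
    psi_c = np.array([2.0, np.exp(-theta)])
    return psi_a, psi_b, psi_c

def log_lik(theta):
    """L(theta): log of the factor product summed over x1 and x2."""
    psi_a, psi_b, psi_c = factors(theta)
    return np.log((psi_a[:, None] * psi_b * psi_c[None, :]).sum())

def plf(theta, theta_prev):
    """Assumed PLF for the covering set {a, b}, {c}; x2 is the boundary node."""
    psi_a, psi_b, psi_c = factors(theta)
    prev_a, prev_b, prev_c = factors(theta_prev)
    # Region {a, b}: the message c -> x2 is computed at theta_prev.
    msg_c_to_2 = prev_c
    term_ab = np.log((psi_a[:, None] * psi_b * msg_c_to_2[None, :]).sum())
    # Region {c}: the message b -> x2 is computed at theta_prev.
    msg_b_to_2 = (prev_a[:, None] * prev_b).sum(axis=0)
    term_c = np.log((psi_c * msg_b_to_2).sum())
    return term_ab + term_c

def central_diff(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

theta = 0.4
grad_L = central_diff(log_lik, theta)
grad_plf = central_diff(lambda t: plf(t, theta), theta)  # theta' held at theta
print(grad_L, grad_plf)  # equal up to finite-difference error
```

Only the first argument of the PLF is differentiated; the second one is frozen at the current value, which is exactly the situation described by the condition $\theta' = \theta$.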

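Property 3.2: The Hessian of $L_{\Theta}(\theta)$ may be expressed as

$$\nabla^2_{\theta} L_{\Theta}(\theta) = \nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta) + \nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta), \qquad (16)$$

where $\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta)$ and
$\nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta)$ are
defined in (17) and (18).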

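Proof: Starting from (12), we have

$$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta') = \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} \nabla_{\theta} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta}^{T} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) + \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla^2_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta). \qquad (19)$$

Now, using the definition of
$b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta')$ and
the fact that
$\nabla_{\theta} \log f_{\theta} = \nabla_{\theta} f_{\theta} / f_{\theta}$,
we obtain after some calculus

$$\nabla_{\theta} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') = b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \Big( \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) - \sum_{\tilde{x}_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(\tilde{x}_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(\tilde{x}_{\mathcal{R}}, \theta) \Big). \qquad (20)$$

Plugging (20) into (19), we get (17). Proceeding in the same
way, we can get similar expressions for
$\nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta')$ and
$\nabla^2_{\theta} L_{\Theta}(\theta)$ and prove (16). □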

4. CHARACTERIZATION OF THE CONVERGENCE 
OF THE IPLM ALGORITHM 

In this section, we study the properties of the following
iterative algorithm:

$$\hat{\theta}^{(n+1)} = \arg\max_{\theta} G^{\Omega}_{\Theta,\Theta'}\big(\theta, \hat{\theta}^{(n)}\big). \qquad (21)$$

It is easy to see that (21) includes the IPLM algorithms
considered in [1, 2, 3]. Note also that (21) is equivalent to
the "Hybrid-EM algorithm" considered in [5] when none of the
nodes follows the so-called "E-log" rule [5].
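The structure of such an iteration, in which each new estimate maximizes a two-argument objective whose second argument is frozen at the previous estimate, can be sketched as follows. The quadratic surrogate used here is not a PLF derived from a factor graph: it is a hypothetical stand-in, chosen only because its maximizer and fixed point are available in closed form. The constants `A`, `B`, `d` and the function names are likewise assumptions of this example.

```python
def iterate_to_fixed_point(argmax_step, theta0, n_iter=100, tol=1e-12):
    """Repeat theta <- argmax_step(theta) until the update is below tol."""
    history = [theta0]
    for _ in range(n_iter):
        theta_next = argmax_step(history[-1])
        history.append(theta_next)
        if abs(history[-1] - history[-2]) < tol:
            break
    return history

# Hypothetical surrogate: G(theta, theta') = -A/2 theta^2 + B theta theta' + d theta.
# For A > 0 its maximizer over theta is (B * theta' + d) / A.
A, B, d = 2.0, 0.5, 3.0

def argmax_step(theta_prev):
    return (B * theta_prev + d) / A

history = iterate_to_fixed_point(argmax_step, theta0=0.0)
theta_fixed = d / (A - B)          # solves theta = (B * theta + d) / A
print(history[-1], theta_fixed)
```

In this scalar toy case the error shrinks by the factor B / A at every iteration, so the iteration converges exactly when that ratio lies strictly between -1 and 1.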

Let us now prove some interesting properties of (21): 

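Property 4.1: Let $\theta_f$ be a fixed point of (21). Then we have

$$\nabla_{\theta} G_{\Theta,B_a(x_{V_a}),B_i(x_i)}\big(\theta_f, b_a^*(x_{V_a}), b_i^*(x_i)\big) = 0. \qquad (22)$$

Proof: If $\theta_f$ is a fixed point, then we must have

$$\nabla_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) = 0. \qquad (23)$$

Now, from (9), this also implies

$$\nabla_{\theta} G_{\Theta,B_a(x_{V_a}),B_i(x_i)}\big(\theta_f, b_a^*(x_{V_a}), b_i^*(x_i)\big) = 0, \qquad (24)$$

and $\theta_f$ is a stationary point of the Bethe free energy. □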

Property 4.1 gives a simple interpretation of the fixed points
of the IPLM algorithm: any fixed point of the IPLM algorithm
must be a stationary point of the Bethe free energy. This
feature is of course highly desirable since any solution of (4)
must also cancel the first derivative of the Bethe free energy.
Note that in the particular case of FGs without cycles, the
Bethe free energy is equal to $-\log p_{Y,\Theta}(y, \theta)$ and
we therefore recover the result previously proved in [4, 5].

The next property relates the fixed points of the IPLM al- 
gorithm to those of the EM algorithm: 

Property 4.2: Let $\mathcal{T}_{\Omega}$ denote the set of fixed
points of (21) and let $\mathcal{T}_{EM}$ denote the set of fixed
points of the EM algorithm¹. Then, we have

$$\mathcal{T}_{\Omega} \subseteq \mathcal{T}_{EM}. \qquad (25)$$



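Proof: We must show that if $\theta_f$ is a fixed point of (21),
then it is also a fixed point of the EM algorithm. Now, any
$\theta_f$ which satisfies the two following sufficient
conditions [10]:

$$\sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta_f, \theta_f) \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta_f) = 0, \qquad (26)$$

$$\sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta_f, \theta_f) \, \nabla^2_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta_f) \preceq 0, \qquad (27)$$

is a fixed point of the EM algorithm. From properties 3.1, 4.1
and (12), we know that the first condition is fulfilled for any
fixed point of (21). Let us show that any fixed point of (21)
also satisfies the second one. If $\theta_f$ is a fixed point of
(21), then

$$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \preceq 0. \qquad (28)$$

Now, we have

$$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \succeq \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta_f, \theta_f) \, \nabla^2_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta_f), \qquad (29)$$

since the last two terms in (17) form a positive semidefinite
matrix. As a consequence, we see from (29) that (28) also
implies (27). □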
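The inequality used at the end of this proof rests on a general fact: the Hessian of the log of a sum of positive terms equals the belief-weighted Hessian of the log-terms plus a covariance of their gradients, and a covariance is never negative. The sketch below checks this for a single hypothetical region with a scalar parameter. The Gaussian-shaped factor, the weights `w` (standing in for fixed boundary messages), and the centers `mu` are assumptions of this example.

```python
import numpy as np

w = np.array([0.2, 0.5, 0.3])     # stand-in for fixed boundary messages
mu = np.array([-1.0, 0.5, 2.0])   # hypothetical factor parameters

def log_sum(theta):
    """log sum_x w(x) * Psi(x, theta), with Psi = exp(-(theta - mu_x)^2 / 2)."""
    return np.log(np.sum(w * np.exp(-0.5 * (theta - mu) ** 2)))

def beliefs(theta):
    u = w * np.exp(-0.5 * (theta - mu) ** 2)
    return u / u.sum()

theta, h = 0.7, 1e-4
# Second derivative of the log-sum by central finite differences.
hess_fd = (log_sum(theta + h) - 2.0 * log_sum(theta) + log_sum(theta - h)) / h**2

b = beliefs(theta)
score = -(theta - mu)                        # d/dtheta log Psi(x, theta)
expected_hess = np.sum(b * (-1.0))           # d2/dtheta2 log Psi = -1 for all x
score_var = np.sum(b * score**2) - np.sum(b * score) ** 2

print(hess_fd, expected_hess + score_var)    # agree up to discretization error
```

Because `score_var` is a variance, the Hessian of the log-sum is never smaller than the expected Hessian of the log-factors, which is the scalar analogue of the matrix inequality invoked in the proof.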

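Property 4.3: The IPLM algorithm never locally converges to
minima of $L_{\Theta}(\theta)$. Moreover, it locally converges to
a maximum of $L_{\Theta}(\theta)$, say $\theta_m$, if and only if

$$\nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta_m, \theta_m) \succ \nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_m, \theta_m). \qquad (30)$$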



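¹ If the FG is cycle free, we refer to the standard EM
algorithm [6]. When the FG contains cycles, we consider the
"approximate EM algorithm" described in [11], i.e., the E-step
is performed by applying the BP algorithm on the cyclic FG.

$$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta') = \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla^2_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta)$$
$$\quad + \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \nabla_{\theta}^{T} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta)$$
$$\quad - \sum_{\mathcal{R} \in \Omega} \Big( \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \Big) \Big( \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta}^{T} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \Big) \qquad (17)$$

$$\nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta, \theta') = \sum_{\mathcal{R} \in \Omega} \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \, \nabla_{\theta'}^{T} \log \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta')$$
$$\quad - \sum_{\mathcal{R} \in \Omega} \Big( \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta} \log \Psi_{X_{\mathcal{R}},\Theta}(x_{\mathcal{R}}, \theta) \Big) \Big( \sum_{x_{\mathcal{R}}} b_{X_{\mathcal{R}},\Theta,\Theta'}(x_{\mathcal{R}}, \theta, \theta') \, \nabla_{\theta'}^{T} \log \Phi_{X_{\mathcal{R}},\Theta'}(x_{\mathcal{R}}, \theta') \Big) \qquad (18)$$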
Proof: In order to prove property 4.3, let us consider the
following condition of convergence:

$$-I \prec R_{\Omega}(\theta_f) \prec I, \qquad (31)$$

where $I$ is the identity matrix and (see [12])

$$R_{\Omega}(\theta_f) = \big( -\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \big)^{-1} \nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \qquad (32)$$

is the (local) rate of convergence of (21) around $\theta_f$. We
will show that (31) is never satisfied for minima whereas it is
satisfied for maxima if and only if (30) is satisfied.

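Taking into account that
$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \prec 0$
for any fixed point and using (32), condition (31) is equivalent to
$\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \prec \nabla_{\theta,\theta'} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \prec -\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f)$.
Adding $\nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f)$
to each term and using (16), (31) may be equivalently rewritten as

$$2 \nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \prec \nabla^2_{\theta} L_{\Theta}(\theta_f) \prec 0. \qquad (33)$$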



Based on this expression, we can draw the following two
conclusions. First, if $\theta_f$ is a minimum of
$L_{\Theta}(\theta)$, then the IPLM algorithm locally diverges
from $\theta_f$. Indeed, if $\theta_f$ corresponds to a minimum,
then $\nabla^2_{\theta} L_{\Theta}(\theta_f) \succ 0$. Therefore, the
second inequality in (33) is violated and the algorithm does
not converge to $\theta_f$. On the other hand, if $\theta_f$
corresponds to a maximum of $L_{\Theta}(\theta)$, then
$\nabla^2_{\theta} L_{\Theta}(\theta_f) \prec 0$ and the second
inequality in (33) is always satisfied. The (local) convergence
to $\theta_f$ is therefore ensured if and only if

$$2 \nabla^2_{\theta} G^{\Omega}_{\Theta,\Theta'}(\theta_f, \theta_f) \prec \nabla^2_{\theta} L_{\Theta}(\theta_f), \qquad (34)$$

which is equivalent to (30) by using (16). □
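The local convergence test used in this proof can be sketched numerically for given Hessian blocks. The function below takes the Hessian of the PLF with respect to its first argument and the cross derivative with respect to both arguments, both evaluated at a fixed point. It returns the spectral radius of the rate matrix, whether the fixed point is a maximum of the objective (using the Hessian decomposition of Property 3.2), and whether the matrix inequality of Property 4.3 holds. The function names and the 2x2 matrices are hypothetical and do not come from any factor graph; the cross-derivative block is assumed symmetric.

```python
import numpy as np

def is_positive_definite(m):
    return bool(np.all(np.linalg.eigvalsh(m) > 0.0))

def local_convergence(hess_theta, hess_cross):
    """hess_theta: negative definite block; hess_cross: symmetric cross block."""
    rate = np.linalg.solve(-hess_theta, hess_cross)        # rate matrix
    spectral_radius = float(np.max(np.abs(np.linalg.eigvals(rate))))
    hess_objective = hess_theta + hess_cross               # Hessian of L
    is_maximum = is_positive_definite(-hess_objective)
    condition = is_positive_definite(hess_cross - hess_theta)
    return spectral_radius, is_maximum, condition

hess_theta = -np.array([[2.0, 0.5],
                        [0.5, 1.0]])

# Case 1: the cross block dominates the first-argument Hessian.
cross_ok = np.array([[0.5, 0.1],
                     [0.1, 0.3]])
rho_ok, max_ok, cond_ok = local_convergence(hess_theta, cross_ok)

# Case 2: still a maximum of the objective, but the condition fails.
cross_bad = np.array([[-3.0, 0.0],
                      [0.0, -0.2]])
rho_bad, max_bad, cond_bad = local_convergence(hess_theta, cross_bad)

print(rho_ok, max_ok, cond_ok)
print(rho_bad, max_bad, cond_bad)
```

In the first case the spectral radius is below one and the condition holds; in the second case the fixed point is still a maximum, yet the condition fails and the spectral radius exceeds one, so the iteration moves away from it.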
In property 4.1, we saw that some fixed points of the IPLM
algorithm can possibly correspond to maxima of the Bethe free
energy. From property 4.3, we see that even if a maximum of the
Bethe free energy is a fixed point of (21), the algorithm will
not converge to it. Moreover, property 4.3 provides a necessary
and sufficient condition (30) for convergence to the minima of
the Bethe free energy. This condition is however difficult to
check in practice since it requires the complex task of
evaluating $\nabla^2_{\theta} L_{\Theta}(\theta_f)$ (see (34)).
However, it gives an interesting insight into when the IPLM
algorithm is likely to converge by looking at the expressions
in (17), (18).